simdutf_connector: in_tail: skip UTF-16/UTF-8 BOM by erikced · Pull Request #10328 · fluent/fluent-bit

erikced · 2025-05-13T09:56:43Z

This MR updates simdutf_connector to reduce the number of copies when converting UTF-16 to UTF-8 and to remove the UTF-16 BOM prior to conversion so that no UTF-8 BOM is present in the converted output. tail_file is also updated to skip any encountered UTF-8 BOM if the unicode conversion returns FLB_UNICODE_CONVERT_NOP.
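As an illustration of the UTF-16 BOM handling described above, here is a minimal C sketch. It is not the code from this PR, and the helper name `utf16_bom_len` is hypothetical. It reports how many leading bytes to skip, so that the converter never sees U+FEFF and therefore never writes the UTF-8 BOM (EF BB BF) to the output.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of Fluent Bit): return the number of
 * leading bytes occupied by a UTF-16 BOM, or 0 if there is none.
 * little_endian selects the byte order the caller expects.
 */
static size_t utf16_bom_len(const unsigned char *buf, size_t len,
                            int little_endian)
{
    if (len < 2) {
        return 0;
    }
    /* U+FEFF is stored as FF FE in UTF-16LE and as FE FF in UTF-16BE */
    if (little_endian && buf[0] == 0xFF && buf[1] == 0xFE) {
        return 2;
    }
    if (!little_endian && buf[0] == 0xFE && buf[1] == 0xFF) {
        return 2;
    }
    return 0;
}
```

A caller would advance the input pointer by the returned length and shrink the length by the same amount before handing the buffer to the converter.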

Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

[N/A] Example configuration file for the change
Debug log output from testing the change

Fluent Bit v4.0.2 | NIGHTLY_BUILD=0 - DO NOT USE IN PRODUCTION!
* Copyright (C) 2015-2025 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

______ _                  _    ______ _ _             ___  _____ 
|  ___| |                | |   | ___ (_) |           /   ||  _  |
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __/ /| || |/' |
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / / /_| ||  /| |
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /\___  |\ |_/ /
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/     |_(_)___/ 


[2025/05/13 09:35:34] [ info] Configuration:
[2025/05/13 09:35:34] [ info]  flush time     | 10.000000 seconds
[2025/05/13 09:35:34] [ info]  grace          | 5 seconds
[2025/05/13 09:35:34] [ info]  daemon         | 0
[2025/05/13 09:35:34] [ info] ___________
[2025/05/13 09:35:34] [ info]  inputs:
[2025/05/13 09:35:34] [ info]      tail
[2025/05/13 09:35:34] [ info]      tail
[2025/05/13 09:35:34] [ info] ___________
[2025/05/13 09:35:34] [ info]  filters:
[2025/05/13 09:35:34] [ info] ___________
[2025/05/13 09:35:34] [ info]  outputs:
[2025/05/13 09:35:34] [ info]      stdout.0
[2025/05/13 09:35:34] [ info] ___________
[2025/05/13 09:35:34] [ info]  collectors:
[2025/05/13 09:35:34] [ info] [fluent bit] version=4.0.2, commit=3c8f9f27e3, pid=1
[2025/05/13 09:35:34] [debug] [engine] coroutine stack size: 24576 bytes (24.0K)
[2025/05/13 09:35:34] [ info] [storage] ver=1.5.3, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2025/05/13 09:35:34] [ info] [simd    ] disabled
[2025/05/13 09:35:34] [ info] [cmetrics] version=1.0.2
[2025/05/13 09:35:34] [ info] [ctraces ] version=0.6.6
[2025/05/13 09:35:34] [ info] [input:tail:mssql-tail-input] initializing
[2025/05/13 09:35:34] [ info] [input:tail:mssql-tail-input] storage_strategy='memory' (memory only)
[2025/05/13 09:35:34] [debug] [tail:mssql-tail-input] created event channels: read=25 write=26
[2025/05/13 09:35:34] [ info] [input:tail:mssql-tail-input] adjusted buf_max_size to 128001
[2025/05/13 09:35:34] [ info] [input:tail:mssql-tail-input] adjusted buf_chunk_size to 32769
[2025/05/13 09:35:34] [debug] [input:tail:mssql-tail-input] flb_tail_fs_inotify_init() initializing inotify tail input
[2025/05/13 09:35:34] [debug] [input:tail:mssql-tail-input] inotify watch fd=31
[2025/05/13 09:35:34] [debug] [input:tail:mssql-tail-input] scanning path /tmp/ERRORLOG-LE
[2025/05/13 09:35:34] [debug] [input:tail:mssql-tail-input] file will be read in POSIX_FADV_DONTNEED mode /tmp/ERRORLOG-LE
[2025/05/13 09:35:34] [debug] [input:tail:mssql-tail-input] inode=3266809 with offset=0 appended as /tmp/ERRORLOG-LE
[2025/05/13 09:35:34] [debug] [input:tail:mssql-tail-input] scan_glob add(): /tmp/ERRORLOG-LE, inode 3266809
[2025/05/13 09:35:34] [debug] [input:tail:mssql-tail-input] 1 new files found on path '/tmp/ERRORLOG-LE'
[2025/05/13 09:35:34] [ info] [input:tail:mssql-tail-input] initializing
[2025/05/13 09:35:34] [ info] [input:tail:mssql-tail-input] storage_strategy='memory' (memory only)
[2025/05/13 09:35:34] [debug] [tail:mssql-tail-input] created event channels: read=33 write=34
[2025/05/13 09:35:34] [ info] [input:tail:mssql-tail-input] adjusted buf_max_size to 128001
[2025/05/13 09:35:34] [ info] [input:tail:mssql-tail-input] adjusted buf_chunk_size to 32769
[2025/05/13 09:35:34] [debug] [input:tail:mssql-tail-input] flb_tail_fs_inotify_init() initializing inotify tail input
[2025/05/13 09:35:34] [debug] [input:tail:mssql-tail-input] inotify watch fd=39
[2025/05/13 09:35:34] [debug] [input:tail:mssql-tail-input] scanning path /tmp/ERRORLOG-BE
[2025/05/13 09:35:34] [debug] [input:tail:mssql-tail-input] file will be read in POSIX_FADV_DONTNEED mode /tmp/ERRORLOG-BE
[2025/05/13 09:35:34] [debug] [input:tail:mssql-tail-input] inode=4922164 with offset=0 appended as /tmp/ERRORLOG-BE
[2025/05/13 09:35:34] [debug] [input:tail:mssql-tail-input] scan_glob add(): /tmp/ERRORLOG-BE, inode 4922164
[2025/05/13 09:35:34] [debug] [input:tail:mssql-tail-input] 1 new files found on path '/tmp/ERRORLOG-BE'
[2025/05/13 09:35:34] [debug] [stdout:stdout.0] created event channels: read=41 write=42
[2025/05/13 09:35:34] [debug] [router] match rule tail.0:stdout.0
[2025/05/13 09:35:34] [debug] [router] match rule tail.1:stdout.0
[2025/05/13 09:35:34] [ info] [sp] stream processor started
[2025/05/13 09:35:34] [debug] [input:tail:mssql-tail-input] [static files] processed 1.4K
[2025/05/13 09:35:34] [debug] [input:tail:mssql-tail-input] [static files] processed 1.4K
[2025/05/13 09:35:34] [ info] [input:tail:mssql-tail-input] inode=3266809 file=/tmp/ERRORLOG-LE ended, stop
[2025/05/13 09:35:34] [debug] [input:tail:mssql-tail-input] inode=3266809 file=/tmp/ERRORLOG-LE promote to TAIL_EVENT
[2025/05/13 09:35:34] [ info] [input:tail:mssql-tail-input] inotify_fs_add(): inode=3266809 watch_fd=1 name=/tmp/ERRORLOG-LE
[2025/05/13 09:35:34] [debug] [input:tail:mssql-tail-input] [static files] processed 0b, done
[2025/05/13 09:35:34] [ info] [input:tail:mssql-tail-input] inode=4922164 file=/tmp/ERRORLOG-BE ended, stop
[2025/05/13 09:35:34] [debug] [input:tail:mssql-tail-input] inode=4922164 file=/tmp/ERRORLOG-BE promote to TAIL_EVENT
[2025/05/13 09:35:34] [ info] [input:tail:mssql-tail-input] inotify_fs_add(): inode=4922164 watch_fd=1 name=/tmp/ERRORLOG-BE
[2025/05/13 09:35:34] [debug] [input:tail:mssql-tail-input] [static files] processed 0b, done
[2025/05/13 09:35:34] [debug] [task] created task=0x7f480e636e60 id=0 OK
[2025/05/13 09:35:34] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0
[2025/05/13 09:35:34] [debug] [task] created task=0x7f480e636f00 id=1 OK
[2025/05/13 09:35:34] [debug] [output:stdout:stdout.0] task_id=1 assigned to thread #0
[2025/05/13 09:35:34] [ warn] [engine] service will shutdown in max 5 seconds
[2025/05/13 09:35:34] [debug] [engine] retry=0x5 for task 0 already scheduled to run, not re-scheduling it.
[2025/05/13 09:35:34] [debug] [engine] retry=0x5 for task 1 already scheduled to run, not re-scheduling it.
[2025/05/13 09:35:34] [ info] [input] pausing mssql-tail-input
[2025/05/13 09:35:34] [ info] [input] pausing mssql-tail-input
[2025/05/13 09:35:34] [ warn] [engine] service will shutdown in max 5 seconds
[2025/05/13 09:35:34] [debug] [engine] retry=0x5 for task 0 already scheduled to run, not re-scheduling it.
[2025/05/13 09:35:34] [debug] [engine] retry=0x5 for task 1 already scheduled to run, not re-scheduling it.
[2025/05/13 09:35:34] [ info] [input] pausing mssql-tail-input
[2025/05/13 09:35:34] [ info] [input] pausing mssql-tail-input
[0] mssql.errorlog.le: [[1747128934.238965455, {}], {"filename"=>"/tmp/ERRORLOG-LE", "log"=>"2025-03-29 03:24:27.73 Server      Microsoft SQL Server 2022 (RTM-CU9) (KB5030731) - 16.0.4085.2 (X64) "}]
[1] mssql.errorlog.le: [[1747128934.238969253, {}], {"filename"=>"/tmp/ERRORLOG-LE", "log"=>"	Sep 27 2023 12:05:43 "}]
[2] mssql.errorlog.le: [[1747128934.238970461, {}], {"filename"=>"/tmp/ERRORLOG-LE", "log"=>"	Copyright (C) 2022 Microsoft Corporation"}]
[3] mssql.errorlog.le: [[1747128934.238971867, {}], {"filename"=>"/tmp/ERRORLOG-LE", "log"=>"	Standard Edition (64-bit) on Windows Server 2022 Datacenter 10.0 <X64> (Build 20348: ) (Hypervisor)"}]
[4] mssql.errorlog.le: [[1747128934.238972427, {}], {"filename"=>"/tmp/ERRORLOG-LE", "log"=>""}]
[5] mssql.errorlog.le: [[1747128934.238974464, {}], {"filename"=>"/tmp/ERRORLOG-LE", "log"=>"2025-03-29 03:24:27.73 Server      UTC adjustment: 0:00"}]
[6] mssql.errorlog.le: [[1747128934.238976188, {}], {"filename"=>"/tmp/ERRORLOG-LE", "log"=>"2025-03-29 03:24:27.73 Server      (c) Microsoft Corporation."}]
[7] mssql.errorlog.le: [[1747128934.238977793, {}], {"filename"=>"/tmp/ERRORLOG-LE", "log"=>"2025-03-29 03:24:27.73 Server      All rights reserved."}]
[8] mssql.errorlog.le: [[1747128934.238979542, {}], {"filename"=>"/tmp/ERRORLOG-LE", "log"=>"2025-03-29 03:24:27.73 Server      Server process ID is 2948."}]
[9] mssql.errorlog.le: [[1747128934.238982076, {}], {"filename"=>"/tmp/ERRORLOG-LE", "log"=>"2025-03-29 03:24:27.74 Server      System Manufacturer: 'Microsoft Corporation', System Model: 'Virtual Machine'."}]
[10] mssql.errorlog.le: [[1747128934.238983865, {}], {"filename"=>"/tmp/ERRORLOG-LE", "log"=>"2025-03-29 03:24:27.74 Server      Authentication mode is MIXED."}]
[0] mssql.errorlog.be: [[1747128934.239409586, {}], {"filename"=>"/tmp/ERRORLOG-BE", "log"=>"2025-03-29 03:24:27.73 Server      Microsoft SQL Server 2022 (RTM-CU9) (KB5030731) - 16.0.4085.2 (X64) "}]
[1] mssql.errorlog.be: [[1747128934.239412160, {}], {"filename"=>"/tmp/ERRORLOG-BE", "log"=>"	Sep 27 2023 12:05:43 "}]
[2] mssql.errorlog.be: [[1747128934.239413323, {}], {"filename"=>"/tmp/ERRORLOG-BE", "log"=>"	Copyright (C) 2022 Microsoft Corporation"}]
[3] mssql.errorlog.be: [[1747128934.239414661, {}], {"filename"=>"/tmp/ERRORLOG-BE", "log"=>"	Standard Edition (64-bit) on Windows Server 2022 Datacenter 10.0 <X64> (Build 20348: ) (Hypervisor)"}]
[4] mssql.errorlog.be: [[1747128934.239415262, {}], {"filename"=>"/tmp/ERRORLOG-BE", "log"=>""}]
[5] mssql.errorlog.be: [[1747128934.239417324, {}], {"filename"=>"/tmp/ERRORLOG-BE", "log"=>"2025-03-29 03:24:27.73 Server      UTC adjustment: 0:00"}]
[6] mssql.errorlog.be: [[1747128934.239419050, {}], {"filename"=>"/tmp/ERRORLOG-BE", "log"=>"2025-03-29 03:24:27.73 Server      (c) Microsoft Corporation."}]
[7] mssql.errorlog.be: [[1747128934.239420616, {}], {"filename"=>"/tmp/ERRORLOG-BE", "log"=>"2025-03-29 03:24:27.73 Server      All rights reserved."}]
[8] mssql.errorlog.be: [[1747128934.239422207, {}], {"filename"=>"/tmp/ERRORLOG-BE", "log"=>"2025-03-29 03:24:27.73 Server      Server process ID is 2948."}]
[9] mssql.errorlog.be: [[1747128934.239424601, {}], {"filename"=>"/tmp/ERRORLOG-BE", "log"=>"2025-03-29 03:24:27.74 Server      System Manufacturer: 'Microsoft Corporation', System Model: 'Virtual Machine'."}]
[10] mssql.errorlog.be: [[1747128934.239426338, {}], {"filename"=>"/tmp/ERRORLOG-BE", "log"=>"2025-03-29 03:24:27.74 Server      Authentication mode is MIXED."}]
[2025/05/13 09:35:34] [ info] [output:stdout:stdout.0] worker #0 started
[2025/05/13 09:35:34] [debug] [out flush] cb_destroy coro_id=0
[2025/05/13 09:35:34] [debug] [out flush] cb_destroy coro_id=1
[2025/05/13 09:35:34] [debug] [task] destroy task=0x7f480e636e60 (task_id=0)
[2025/05/13 09:35:34] [debug] [task] destroy task=0x7f480e636f00 (task_id=1)
[2025/05/13 09:35:34] [ info] [engine] service has stopped (0 pending tasks)
[2025/05/13 09:35:34] [ info] [output:stdout:stdout.0] thread worker #0 stopping...
[2025/05/13 09:35:34] [ info] [input] pausing mssql-tail-input
[2025/05/13 09:35:34] [ info] [output:stdout:stdout.0] thread worker #0 stopped
[2025/05/13 09:35:34] [ info] [input] pausing mssql-tail-input
[2025/05/13 09:35:34] [debug] [input:tail:mssql-tail-input] inode=4922164 removing file name /tmp/ERRORLOG-BE
[2025/05/13 09:35:34] [ info] [input:tail:mssql-tail-input] inotify_fs_remove(): inode=4922164 watch_fd=1
[2025/05/13 09:35:34] [debug] [input:tail:mssql-tail-input] inode=3266809 removing file name /tmp/ERRORLOG-LE
[2025/05/13 09:35:34] [ info] [input:tail:mssql-tail-input] inotify_fs_remove(): inode=3266809 watch_fd=1

Attached Valgrind output that shows no leaks or memory corruption was found

==1==
==1== HEAP SUMMARY:
==1==     in use at exit: 0 bytes in 0 blocks
==1==   total heap usage: 4,618 allocs, 4,618 frees, 1,749,754 bytes allocated
==1==
==1== All heap blocks were freed -- no leaks are possible
==1==
==1== For lists of detected and suppressed errors, rerun with: -s
==1== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

[N/A] Run local packaging test showing all targets (including any new ones) build.
[N/A] Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

[N/A] Documentation required for this feature

Backporting

[N/A] Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

leonardo-albertovich · 2025-05-13T10:22:27Z

Could you please take a look at this @cosmo0920? There are a few coding style issues but I'm more interested in validating the actual UTF stuff.

cosmo0920

I tested the patch on my dev box and successfully converted both UTF-16-LE and UTF-16-BE to UTF-8. I had used std::unique_ptr<char[]> so that allocation and deallocation happen automatically in the simdutf module; changing it to the flb_malloc style was not something I had considered, TBH. This should be fine, and there are no memory leaks.

cosmo0920 · 2025-05-13T10:55:19Z

Ah, jemalloc headers are not found in CI tasks. Need to investigate.

cosmo0920 · 2025-05-13T10:55:57Z

Could you please take a look at this @cosmo0920? There are a few coding style issues but I'm more interested in validating the actual UTF stuff.

This change looks fine to me. Could you go ahead and point out the minor issues on your side?

cosmo0920

One thing: could you add the following lines at the end of the file here?

if(FLB_JEMALLOC)
  target_link_libraries(flb-simdutf-connector-static ${JEMALLOC_LIBRARIES})
endif()

It seems the jemalloc-related error could be caused by a missing dependency on jemalloc, so we have to mark jemalloc as one of the dependencies of simdutf-connector.

erikced · 2025-05-13T11:39:49Z

Done. Thanks for the swift feedback and the help with deciphering the build errors.

cosmo0920 · 2025-05-14T06:55:44Z

I tracked down the weird compilation errors on Windows here:
monkey/monkey#423
They could be caused by an old implementation that was correct at the time it was written, so we need to fix them in the monkey repo first.

leonardo-albertovich

I've left some change requests. Please check all of the code for the issues I've pointed out: I tried not to add one note per occurrence, but I noticed multiple instances of some of them (such as the missing exception handling).

leonardo-albertovich · 2025-05-14T10:59:33Z

plugins/in_tail/tail_file.c

        else if (ret == FLB_UNICODE_CONVERT_NOP) {
            flb_plg_debug(ctx->ins, "nothing to convert encoding '%.*s'", end - data, data);
+            /* Skip the UTF-8 BOM */
+            if ((end - data) >= 3 && (data[0] & 0xFF) == 0xEF && (data[1] & 0xFF) == 0xBB


I think that in this branch of the conditional the buffer has not changed, so file->buf_len is still valid, which means the conditional should be written as:

if (file->buf_len >= 3 && (data[0] & 0xFF) == 0xEF && (data[1] & 0xFF) == 0xBB && (data[2] & 0xFF) == 0xBF) {

Additionally, is there any reason for us not to define data as unsigned char * so we can simplify this?

That seems accurate; I just took the expression from the line above. As for changing the type, data originally comes from the flb_tail_file struct. In this case I think an easier solution would be to use char constants instead, e.g. '\xFF', which makes the integer promotion behave the same way for both the lhs and rhs of the comparison.
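The char-constant approach suggested above could look roughly like the following sketch. The helper name `utf8_bom_len` is hypothetical and this is not the code that was merged. Because the buffer bytes and the constants are both plain `char` values, they are promoted the same way whether `char` is signed or unsigned, so no `& 0xFF` masking is needed.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not the actual in_tail code): return how many
 * leading bytes of a plain char buffer form a UTF-8 BOM (EF BB BF).
 * Both sides of each comparison are char values, so they undergo the
 * same integer promotion regardless of the signedness of char.
 */
static size_t utf8_bom_len(const char *data, size_t len)
{
    if (len >= 3 &&
        data[0] == '\xEF' &&
        data[1] == '\xBB' &&
        data[2] == '\xBF') {
        return 3;
    }
    return 0;
}
```

The caller can then skip the returned number of bytes before processing the line data.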

leonardo-albertovich · 2025-05-14T11:03:06Z

src/simdutf/flb_simdutf_connector.cpp

-    result = simdutf::validate_utf8_with_errors(output.get(), clen);
-    if (result.error == simdutf::error_code::SUCCESS && converted > 0) {
-        std::string result_string(output.get(), clen);
+    *utf8_output = (char*)flb_malloc(clen + 1);


Please check the coding style guide for this and other issues.

In this case, spaces are missing in the data type and after the closing parenthesis of the cast.

leonardo-albertovich · 2025-05-14T11:10:23Z

src/simdutf/flb_simdutf_connector.cpp

+            aligned_input = (const char16_t *)input;
+        }
+        else {
+            str16.resize(len / 2);


According to the C++ reference we are missing some exception handling here.

Check. It might be a similar amount of work (and more consistent with the rest of the C codebase) to just use flb_malloc here as well, since the simdutf functions used are labelled noexcept.
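A rough sketch of the malloc-based alternative discussed here, written in plain C with the standard `malloc` standing in for `flb_malloc`. The helper name `utf16_units` and its signature are hypothetical, not the PR's implementation. The input is used in place when it is already 2-byte aligned and copied only otherwise, and allocation failure is reported through the return value instead of an exception.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not the PR's implementation): expose a byte buffer
 * as UTF-16 code units. An aligned input is returned as-is; an unaligned
 * one is copied into a malloc'd buffer that the caller releases through
 * *to_free. Returns NULL if the allocation fails.
 */
static const uint16_t *utf16_units(const char *input, size_t len,
                                   uint16_t **to_free)
{
    size_t bytes = len - (len % 2);    /* ignore a trailing odd byte */
    uint16_t *copy;

    *to_free = NULL;
    if (((uintptr_t) input % sizeof(uint16_t)) == 0) {
        return (const uint16_t *) input;    /* aligned: no copy needed */
    }

    copy = malloc(bytes > 0 ? bytes : sizeof(uint16_t));
    if (copy == NULL) {
        return NULL;                        /* failure via return value */
    }
    memcpy(copy, input, bytes);
    *to_free = copy;
    return copy;
}
```

The caller always calls `free(to_free)` afterwards; this is a no-op when the input was used in place.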

cosmo0920 · 2025-05-27T06:11:42Z

Hi, could you rebase off the current master?
The weird errors on Windows should be fixed with the library syncs.

cosmo0920

Could you add the static qualifier to the OSSFuzz-related code?

diff --git a/include/fluent-bit/flb_mem.h b/include/fluent-bit/flb_mem.h
index 3c83580db..ea5eb62cc 100644
--- a/include/fluent-bit/flb_mem.h
+++ b/include/fluent-bit/flb_mem.h
@@ -50,8 +50,8 @@
 /*
  * Return 1 or 0 based on a probability.
  */
-int flb_malloc_p;
-int flb_malloc_mod;
+static int flb_malloc_p;
+static int flb_malloc_mod;
 
 static inline int flb_fuzz_get_probability(int val) {
   flb_malloc_p += 1;

This could fix compilation errors for OSSFuzz.

erikced · 2025-05-28T09:20:59Z

Could you add static attributes for OSSFuzz related codes?
...
This could fix compilation errors for OSSFuzz.

I did not see this until after I pushed my latest change, but I made them extern instead so that they would be shared. Wouldn't static make a copy per translation unit/object file, which is probably not what is wanted here?

cosmo0920 · 2025-05-28T10:21:00Z

Ah, yes. Using extern is better than my suggestion. Thanks for the improvement.
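The difference between the two options can be sketched in a single file as follows. The names `fuzz_counter` and `bump_counter` are hypothetical stand-ins, not the actual Fluent Bit symbols. With `static`, every translation unit that includes the header gets its own private counter. With an `extern` declaration in the header and one definition in a single source file, all of them share one object. It also avoids the problem that a plain `int x;` at namespace scope in a header is a definition in C++, which breaks the one definition rule once the header is included from more than one C++ source file.

```c
/* --- what would live in a shared header (hypothetical: counter.h) --- */
extern int fuzz_counter;        /* declaration only: no storage here */

static inline int bump_counter(void)
{
    fuzz_counter += 1;          /* every includer updates the same object */
    return fuzz_counter;
}

/* --- what would live in exactly one .c file (hypothetical: counter.c) --- */
int fuzz_counter = 0;           /* the single definition */
```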

erikced requested review from edsiper, fujimotos, koleini and leonardo-albertovich as code owners May 13, 2025 09:56

github-actions bot added the docs-required label May 13, 2025

leonardo-albertovich assigned cosmo0920 May 13, 2025

cosmo0920 requested review from cosmo0920 and removed request for fujimotos May 13, 2025 10:42

erikced had a problem deploying to pr May 13, 2025 10:50 — with GitHub Actions Failure

cosmo0920 approved these changes May 13, 2025

cosmo0920 requested changes May 13, 2025

erikced had a problem deploying to pr May 13, 2025 15:27 — with GitHub Actions Failure

leonardo-albertovich suggested changes May 14, 2025

erikced force-pushed the simdutf-skip-bom branch from b96a988 to 7b26c51 Compare May 15, 2025 09:53

erikced had a problem deploying to pr May 15, 2025 10:22 — with GitHub Actions Failure

erikced force-pushed the simdutf-skip-bom branch from 7b26c51 to eb468c1 Compare May 15, 2025 10:59

erikced requested a review from leonardo-albertovich May 15, 2025 13:20

erikced had a problem deploying to pr May 19, 2025 07:50 — with GitHub Actions Failure

erikced added 4 commits May 27, 2025 09:37

simdutf_connector: reduce copying

10dd0e8

- Do not copy input if data is already aligned.
- Only allocate output once.

Signed-off-by: Erik Cederberg <erik.cederberg@sectra.com>

simdutf_connector: skip UTF-16 BOM

7685a31

When converting UTF-16 to UTF-8, ignore the BOM so that no UTF-8 BOM is written to the output.

Signed-off-by: Erik Cederberg <erik.cederberg@sectra.com>

in_tail: detect, skip UTF-8 BOM

b8e2ee0

If unicode input data is not converted, check if there is a UTF-8 BOM present and skip it.

Signed-off-by: Erik Cederberg <erik.cederberg@sectra.com>

build: link simdutf_connector with jemalloc

317d381

Signed-off-by: Erik Cederberg <erik.cederberg@sectra.com>

erikced force-pushed the simdutf-skip-bom branch from eb468c1 to 317d381 Compare May 27, 2025 10:12

erikced temporarily deployed to pr May 27, 2025 10:12 — with GitHub Actions Inactive

erikced temporarily deployed to pr May 27, 2025 10:28 — with GitHub Actions Inactive

erikced temporarily deployed to pr May 27, 2025 10:29 — with GitHub Actions Inactive

cosmo0920 requested changes May 28, 2025

mem: make flb_malloc_{p,mod} extern

592c96f

Move definitions to its own file to avoid breaking the C++ one definition rule, improving the integration with the C++ code in flb-simdutf-connector.

Signed-off-by: Erik Cederberg <erik.cederberg@sectra.com>

erikced temporarily deployed to pr May 28, 2025 08:58 — with GitHub Actions Inactive

erikced temporarily deployed to pr May 28, 2025 09:15 — with GitHub Actions Inactive

cosmo0920 approved these changes May 28, 2025

cosmo0920 added this to the Fluent bit v4.0.3 milestone May 28, 2025

edsiper merged commit 455fc22 into fluent:master May 28, 2025
49 checks passed

BrewTestBot mentioned this pull request May 30, 2025

fluent-bit 4.0.3 Homebrew/homebrew-core#225287

Merged
